7 Variational Autoencoders

7.1 Training

A variational autoencoder was trained separately for each bird. To determine the optimal number of embedding dimensions, I calculated the Calinski-Harabasz index, the ratio of between-cluster variance to within-cluster variance (higher values indicate better-separated clusters), using the pre-labelled clusters (fig 7.1). Bird 7358 (66-68 DPH) has relatively stable syllables and song syntax, while bird 6951 (59-63 DPH) has more variable syllables and syntax (see 8.1). For bird 7358, little information is gained beyond 32 dimensions.
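As a rough illustration of the index used here, the following sketch computes the Calinski-Harabasz score from a matrix of embeddings and their pre-assigned cluster labels. It is a minimal NumPy version written for this section, not the code used to produce fig 7.1. The function name `calinski_harabasz` and the toy data are assumptions made for the example. Note that the standard definition also normalises each dispersion term by its degrees of freedom (k - 1 and n - k).

```python
import numpy as np


def calinski_harabasz(X, labels):
    """Calinski-Harabasz index for embeddings X (n_samples, n_dims).

    Hypothetical helper for illustration: the between-cluster dispersion
    divided by the within-cluster dispersion, each normalised by its
    degrees of freedom.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    n, k = X.shape[0], clusters.size
    if k < 2 or k >= n:
        raise ValueError("need 2 <= number of clusters < number of samples")

    overall_mean = X.mean(axis=0)
    between = 0.0  # size-weighted squared distance of centroids to the overall mean
    within = 0.0   # squared distance of points to their own centroid
    for c in clusters:
        members = X[labels == c]
        centroid = members.mean(axis=0)
        between += members.shape[0] * np.sum((centroid - overall_mean) ** 2)
        within += np.sum((members - centroid) ** 2)

    return (between / (k - 1)) / (within / (n - k))


# Toy data (assumed for illustration): two tight, well-separated clusters.
X_demo = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
good_labels = np.array([0, 0, 1, 1])
bad_labels = np.array([0, 1, 0, 1])

score_good = calinski_harabasz(X_demo, good_labels)  # 50.0
score_bad = calinski_harabasz(X_demo, bad_labels)    # 0.08
```

To choose the embedding size, the same score would be computed on the embeddings from models trained with different numbers of latent dimensions and compared across those models.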

Figure 7.1: Reconstruction loss and Calinski-Harabasz Index.

Figure 7.2: Input (left) and decoded (right) syllables.

Figure 7.3: Traversing the embedding space from the centroid of syllable “i” to each other syllable centroid.
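One way such a traversal could be generated is sketched below, assuming a straight-line (linear) interpolation between cluster centroids in the embedding space. The text does not state which path was used, so the interpolation scheme, the function name `centroid_path`, and the toy data are all assumptions. Each row of the returned array would then be passed through the trained decoder (not shown) to obtain a spectrogram.

```python
import numpy as np


def centroid_path(Z, labels, start, end, n_steps=8):
    """Evenly spaced latent points from centroid `start` to centroid `end`.

    Hypothetical helper: Z is (n_samples, n_dims), labels holds one
    syllable label per row. Linear interpolation is an assumption.
    """
    Z = np.asarray(Z, dtype=float)
    labels = np.asarray(labels)
    z_start = Z[labels == start].mean(axis=0)
    z_end = Z[labels == end].mean(axis=0)
    # Column vector of interpolation weights from 0 (start) to 1 (end).
    t = np.linspace(0.0, 1.0, n_steps)[:, None]
    return (1.0 - t) * z_start + t * z_end


# Toy 2-D embeddings (assumed for illustration) for syllables "i" and "a".
Z_demo = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 4.0], [6.0, 4.0]])
labels_demo = np.array(["i", "i", "a", "a"])

path = centroid_path(Z_demo, labels_demo, "i", "a", n_steps=5)
# The path starts at the "i" centroid (1, 0) and ends at the "a" centroid (5, 4).
```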

7.2 Syllable Clustering

Figure 7.4: Syllable clusters from embedded dimensions.

7.3 Bird 7358

Figure 7.5: UMAP projection of song trajectory with neuron spikes shown as dots.